Performance Advantage of the Register Stack in Intel® ItaniumTM Processors

نویسندگان

Ryan Rakvic

Ed Grochowski

Bryan Black

Murali Annavaram

Trung Diep

John P. Shen

چکیده

The Intel® ItaniumTM architecture provides a virtual register stack of unlimited size for use by software. New virtual registers are allocated on a procedure call and deallocated on return. Itanium processors implement the register stack by means of a large physical register file, a mapping from virtual to physical registers, and a Register Stack Engine (RSE) that saves and restores the contents of the physical registers to memory without explicit program intervention. The combination of these features significantly reduces the number of loads and stores required to save registers across procedure calls compared to a conventional architecture. In this paper, we show that the Itanium register stack reduces load and store traffic to the stack by at least a factor of three across select SpecInt2000 and Oracle database benchmarks. Furthermore, we examine the effects of the register stack on data cache miss rates and program execution time. When compared to a conventional architecture, the Itanium architecture on average achieves 7%-8.3% and 10.2%-12% performance advantage on in-order and out-of-order processor models, respectively, as a result of the register stack. Finally we analyze the vitality of stack loads and show that in general few stack loads are vital in an in-order model. However, a larger percentage of stack loads become vital in the out-of-order model leading to a greater performance benefit from the register stack.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Optimizing Intel EPIC/Itanium2 Architecture for Forth

Forth is a stack machine that represents a good match for the register stack of the Explicit Parallel Instruction Computer (EPIC) architecture. In this paper we will introduce a new calling mechanism using the register stack to implement a Forth system more efficiently. Based upon our performance measurements, we will show that the new calling mechanism is a promising technique to improve the p...

متن کامل

Compiler Controlled Register Stack Management for the Intel

Intel Itanium processors were designed with an on chip register stack engine (RSE) in order to reduce the overhead related to procedure call boundaries. The RSE automatically preserves values stored in stacked registers across procedure invocations. This architecture model significantly reduces the amount of spill code necessary to maintain an application’s state, which in turn reduces memory t...

متن کامل

Optimization for the Intel

The Intel R © Itanium R © architecture contains a number of innovative compiler-controllable features designed to exploit instruction level parallelism. New code generation and optimization techniques are critical to the application of these features to improve processor performance. For instance, the Itanium R © architecture provides a compilercontrollable virtual register stack to reduce the ...

متن کامل

Exploiting Intra-function Correlation with the Global History Stack

The demand for more computation power in high-end embedded systems has put embedded processors on parallel evolution track as the RISC processors. Caches and deeper pipelines are standard features on recent embedded microprocessors. As a result of this, some of the performance penalties associated with branch instructions in RISC processors are becoming more prevalent in these processors. As is...

متن کامل

Bit Swapping Linear Feedback Shift Register For Low Power Application Using 130nm Complementary Metal Oxide Semiconductor Technology (TECHNICAL NOTE)

Bit swapping linear feedback shift register (BS-LFSR) is employed in a conventional linear feedback shirt register (LFSR) to reduce its power dissipation and enhance its performance. In this paper, an enhanced BS-LFSR for low power application is proposed. To achieve low power dissipation, the proposed BS-LFSR introduced the stacking technique to reduce leakage current. In addition, three diffe...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2002

Performance Advantage of the Register Stack in Intel® ItaniumTM Processors

نویسندگان

چکیده

منابع مشابه

Optimizing Intel EPIC/Itanium2 Architecture for Forth

Compiler Controlled Register Stack Management for the Intel

Optimization for the Intel

Exploiting Intra-function Correlation with the Global History Stack

Bit Swapping Linear Feedback Shift Register For Low Power Application Using 130nm Complementary Metal Oxide Semiconductor Technology (TECHNICAL NOTE)

عنوان ژورنال:

اشتراک گذاری